These packages can get you started when working on data visualization as a beginner.
Use the function **install.packages("Name_of_package")** in the console to install a package (you only need to do this once).
Load the package with library(Name_of_package) at the beginning of your script to use it.
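If you would rather not remember which packages are already installed, the two steps above can be sketched as a small check. This is only a minimal sketch: missing_packages() is a hypothetical helper written for this post, not part of any package, and the interactive() guard is a design choice of this example.

```r
# Packages used in this post
pkgs <- c("ggplot2", "dplyr", "tidyr", "magrittr")

# Hypothetical helper: returns the names in `pkgs` that are not installed yet.
# Note that requireNamespace() loads a package's namespace if it is available.
missing_packages <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

to_install <- missing_packages(pkgs)

# Only install from an interactive session, so a knitted or scripted run
# never triggers a download by surprise
if (interactive() && length(to_install) > 0) {
  install.packages(to_install)
}
```

After this runs, the usual library() calls at the top of the script load the packages.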
dplyr is an awesome package for manipulating data frames in an easy and basic way!
Some functions include:
mutate() adds new variables that are functions of existing variables
select() picks columns based on their names
filter() picks rows based on their values
summarise() reduces rows to a single summary
arrange() changes row ordering
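To see how these verbs fit together, here is a minimal sketch using the built-in mtcars data set. The choice of columns, the 4-cylinder filter, and the wt_lb column name are illustrative assumptions, not something prescribed by dplyr.

```r
library(dplyr)

# mtcars ships with R; wt is recorded in units of 1000 lbs
four_cyl <- mtcars %>%
  select(mpg, cyl, wt) %>%        # keep three columns by name
  filter(cyl == 4) %>%            # keep only the 4-cylinder cars
  mutate(wt_lb = wt * 1000) %>%   # new column computed from an existing one
  arrange(desc(mpg))              # sort rows from highest to lowest mpg

# Collapse all remaining rows into a single summary row
four_cyl_summary <- four_cyl %>%
  summarise(mean_mpg = mean(mpg), n_cars = n())

print(head(four_cyl, 3))
print(four_cyl_summary)
```

Each verb takes a data frame as its first argument and returns a new data frame, which is why they chain so naturally.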
See documentation here
tidyr helps you reshape your data into a tidy, organized layout!
Some functions include:
gather() takes columns and gathers them into rows
spread() takes two columns and spreads into multiple columns
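A quick round trip may make the two functions easier to picture. The scores data frame below is made-up toy data for illustration only.

```r
library(tidyr)

# Hypothetical toy data: two test scores per student
scores <- data.frame(
  student = c("A", "B"),
  test1 = c(90, 75),
  test2 = c(85, 80)
)

# gather(): collapse the test1/test2 columns into key-value rows
long <- gather(scores, key = "test", value = "score", test1, test2)

# spread(): turn the key column back into one column per test
wide <- spread(long, key = test, value = score)

print(long)
print(wide)
```

Here long has one row per student and test, while wide is back to one row per student.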
See documentation here
ggplot2 is a great package for creating beautiful visualizations super easily!
How to Use:
See documentation here
magrittr lets you chain function calls so your code reads from left to right.
This package allows you to use piping in your code.
Piping uses the syntax %>%, which takes the value on its left and passes it as the first argument of the function on its right.
An example: (2 * 4) %>% add(3) returns 11, because the pipe passes 8 as the first argument of magrittr's add() function.
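Written out as runnable code, the example looks like the sketch below. It uses add(), which magrittr provides as an alias for +; the square-root chain is an extra illustration with made-up numbers.

```r
library(magrittr)

# Without the pipe: a nested call, read from the inside out
nested <- add(2 * 4, 3)

# With the pipe: 2 * 4 is evaluated first, then passed as add()'s first argument
piped <- (2 * 4) %>% add(3)

# Pipes can be chained; each result feeds the next function
chained <- c(1, 4, 9) %>% sqrt() %>% sum()

print(piped)    # 11
print(chained)  # 6
```

The nested and piped versions compute the same thing; the pipe just lets you read the steps in the order they happen.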
See documentation here
library(ggplot2)
library(dplyr)
library(tidyr)
library(magrittr)
Adding a column named time by dividing each value of dist by the corresponding value of speed
knitr::kable(head(cars, 5))
knitr::kable(mutate(head(cars, 5), time = round(dist / speed, digits = 2)))
| speed | dist |
|---|---|
| 4 | 2 |
| 4 | 10 |
| 7 | 4 |
| 7 | 22 |
| 8 | 16 |

| speed | dist | time |
|---|---|---|
| 4 | 2 | 0.50 |
| 4 | 10 | 2.50 |
| 7 | 4 | 0.57 |
| 7 | 22 | 3.14 |
| 8 | 16 | 2.00 |
Reshaping the table to one row per ID, with each group becoming its own column that holds the extra hours of sleep for that group
knitr::kable(sleep)
knitr::kable(sleep %>%
spread(group, extra))
| extra | group | ID |
|---|---|---|
| 0.7 | 1 | 1 |
| -1.6 | 1 | 2 |
| -0.2 | 1 | 3 |
| -1.2 | 1 | 4 |
| -0.1 | 1 | 5 |
| 3.4 | 1 | 6 |
| 3.7 | 1 | 7 |
| 0.8 | 1 | 8 |
| 0.0 | 1 | 9 |
| 2.0 | 1 | 10 |
| 1.9 | 2 | 1 |
| 0.8 | 2 | 2 |
| 1.1 | 2 | 3 |
| 0.1 | 2 | 4 |
| -0.1 | 2 | 5 |
| 4.4 | 2 | 6 |
| 5.5 | 2 | 7 |
| 1.6 | 2 | 8 |
| 4.6 | 2 | 9 |
| 3.4 | 2 | 10 |

| ID | 1 | 2 |
|---|---|---|
| 1 | 0.7 | 1.9 |
| 2 | -1.6 | 0.8 |
| 3 | -0.2 | 1.1 |
| 4 | -1.2 | 0.1 |
| 5 | -0.1 | -0.1 |
| 6 | 3.4 | 4.4 |
| 7 | 3.7 | 5.5 |
| 8 | 0.8 | 1.6 |
| 9 | 0.0 | 4.6 |
| 10 | 2.0 | 3.4 |
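spread() still works, but the tidyr documentation marks it as superseded by pivot_wider(). As a hedged sketch, the same reshaping of the sleep data could be written like this; the variable name sleep_wide is just an illustrative choice.

```r
library(tidyr)
library(magrittr)

# One row per ID, one column per group, filled with the extra sleep values
sleep_wide <- sleep %>%
  pivot_wider(names_from = group, values_from = extra)

print(sleep_wide)
```

The result has the same ID, 1, and 2 columns as the spread() version, returned as a tibble.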
Creating a scatter plot of the iris data, with petal length on the x-axis, petal width on the y-axis, and points colored by species
# Named iris_plot rather than plot to avoid shadowing base R's plot() function
iris_plot <- ggplot(iris) +
  geom_point(aes(x = Petal.Length, y = Petal.Width, color = Species)) +
  scale_color_brewer(palette = "Set2") +
  labs(title = "Petal Length VS Petal Width of Different Iris Species")
print(iris_plot)